A Corpus-based Study of the German Recipient Passive

Authors

  • Patrick Ziering
  • Sina Zarrieß
  • Jonas Kuhn
Abstract

In this paper, we investigate the usage of a non-canonical German passive alternation for ditransitive verbs, the recipient passive, in naturally occurring corpus data. We propose a classifier that predicts the voice of a ditransitive verb based on the contextually determined properties of its arguments. As the recipient passive is a low-frequency phenomenon, we first create a special data set focussing on German ditransitive verbs which are frequently used in the recipient passive. We use a broad-coverage grammar-based parser, the German LFG parser, to automatically annotate our data set for the morpho-syntactic properties of the involved predicate arguments. We train a Maximum Entropy classifier on the automatically annotated sentences and achieve an accuracy of 98.05%, clearly outperforming the baseline that always predicts active voice (94.6%).
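The comparison described in the abstract, a trained classifier against a baseline that always predicts the majority voice, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy examples, the feature names `recipient_pronoun` and `theme_definite`, and the training settings are invented and are not the paper's data or feature set. The model is a binary logistic regression, which is the two-class case of a Maximum Entropy classifier, trained here with plain stochastic gradient ascent rather than whatever optimizer the authors used.

```python
import math
from collections import Counter

# Invented toy data: (argument features, voice label).
# Feature names are hypothetical, not the paper's feature set.
DATA = [
    ({"recipient_pronoun": 1, "theme_definite": 0}, "passive"),
    ({"recipient_pronoun": 1, "theme_definite": 1}, "passive"),
    ({"recipient_pronoun": 0, "theme_definite": 1}, "active"),
    ({"recipient_pronoun": 0, "theme_definite": 0}, "active"),
    ({"recipient_pronoun": 0, "theme_definite": 1}, "active"),
    ({"recipient_pronoun": 0, "theme_definite": 0}, "active"),
]


def majority_baseline_accuracy(data):
    """Accuracy of always predicting the most frequent label."""
    counts = Counter(label for _, label in data)
    return counts.most_common(1)[0][1] / len(data)


def predict_proba(weights, feats):
    """Probability of the 'passive' label under the current weights."""
    score = weights.get("bias", 0.0)
    score += sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-score))


def train_maxent(data, epochs=500, lr=0.5):
    """Fit a two-class maximum entropy (logistic regression) model by SGD."""
    weights = {"bias": 0.0}
    for _ in range(epochs):
        for feats, label in data:
            target = 1.0 if label == "passive" else 0.0
            error = target - predict_proba(weights, feats)
            # Gradient step on the log-likelihood of this example.
            weights["bias"] += lr * error
            for name, value in feats.items():
                weights[name] = weights.get(name, 0.0) + lr * error * value
    return weights


def accuracy(weights, data):
    """Fraction of examples whose predicted voice matches the label."""
    correct = sum(
        (predict_proba(weights, feats) >= 0.5) == (label == "passive")
        for feats, label in data
    )
    return correct / len(data)
```

Calling `majority_baseline_accuracy(DATA)` and `accuracy(train_maxent(DATA), DATA)` gives the two figures to compare. On this toy set the scores are measured on the training data itself, so they only illustrate the mechanics and say nothing about the results reported in the paper.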

Similar resources

The “recipient passive” in West Slavic: A calque from German and its grammaticalization

Among the European languages with a participial passive, there are some that tend to distinguish dynamic passive and object resultative through the use of different auxiliaries, e.g. English, Italian or German. In German, the difference is obligatory, and the German difference between the werden-passive and the sein-resultative had an influence on several neighboring languages (Polish, Sorbian, ...

Encoding Thematic Roles via Syntactic Functions in a German Treebank

One of the major purposes of annotated corpora is their potential for use as databases for linguistic research. An important design criterion for corpora specifically intended for this use is the need to encode a plurality of types of information, some of which are clearly interrelated. This need can conflict with the need to produce large corpora quickly, where the encoding of redundant inform...

Underspecifying and Predicting Voice for Surface Realisation Ranking

This paper addresses a data-driven surface realisation model based on a large-scale reversible grammar of German. We investigate the relationship between the surface realisation performance and the character of the input to generation, i.e. its degree of underspecification. We extend a syntactic surface realisation system, which can be trained to choose among word order variants, such that the ...

A Corpus-based Study of Lexical Bundles in Discussion Section of Medical Research Articles

There has been increasing interest in utilizing corpora in linguistic research and pedagogy in recent years. Rhetorical organization of different sections of research articles may appear similar in various disciplines, but close examination may show subtle differences nonetheless. One of the features that has been at the center of attention especially in recent years is the idiomaticity of a di...

Hedges in English for Academic Purposes: A Corpus-based study of Iranian EFL learners

Hedges, as tools to express tentativeness and doubt, have been studied in plenty of research papers in the Iranian EFL research setting. However, their use in a learner corpus, portraying Iranian learner English, is in need of more research attention. With this end in view, this study aimed at investigating how Iranian EFL learners who have majored in English-related fields in Iran deployed hed...

Publication date: 2012